
    UPMASK: unsupervised photometric membership assignment in stellar clusters

    We develop a method for membership assignment in stellar clusters using only photometry and positions. The method, UPMASK, is aimed to be unsupervised, data driven, model free, and to rely on as few assumptions as possible. It is based on an iterative process, principal component analysis, clustering algorithm, and kernel density estimations. Moreover, it is able to take into account arbitrary error models. An implementation in R was tested on simulated clusters that covered a broad range of ages, masses, distances, reddenings, and also on real data of cluster fields. Running UPMASK on simulations showed that it effectively separates cluster and field populations. The overall spatial structure and distribution of cluster member stars in the colour-magnitude diagram were recovered under a broad variety of conditions. For a set of 360 simulations, the resulting true positive rates (a measurement of purity) and member recovery rates (a measurement of completeness) at the 90% membership probability level reached high values for a range of open cluster ages ($10^{7.1}-10^{9.5}$ yr), initial masses ($0.5-10\times10^3 M_{\odot}$) and heliocentric distances ($0.5-4.0$ kpc). UPMASK was also tested on real data from the fields of the open cluster Haffner 16 and of the closely projected clusters Haffner 10 and Czernik 29. These tests showed that even for moderate variable extinction and cluster superposition, the method yielded useful cluster membership probabilities and provided some insight into their stellar contents. The UPMASK implementation will be available at the CRAN archive. Comment: 12 pages, 13 figures, accepted for publication in Astronomy and Astrophysics

    The first analytical expression to estimate photometric redshifts suggested by a machine

    We report the first analytical expression purely constructed by a machine to determine photometric redshifts ($z_{\rm phot}$) of galaxies. A simple and reliable functional form is derived using 41,214 galaxies from the Sloan Digital Sky Survey Data Release 10 (SDSS-DR10) spectroscopic sample. The method automatically dropped the $u$ and $z$ bands, relying only on $g$, $r$ and $i$ for the final solution. Applying this expression to 1,417,181 other SDSS-DR10 galaxies with measured spectroscopic redshifts ($z_{\rm spec}$), we achieved a mean $\langle (z_{\rm phot} - z_{\rm spec})/(1+z_{\rm spec})\rangle\lesssim 0.0086$ and a scatter $\sigma_{(z_{\rm phot} - z_{\rm spec})/(1+z_{\rm spec})}\lesssim 0.045$ when averaged up to $z \lesssim 1.0$. The method was also applied to the PHAT0 dataset, confirming the competitiveness of our results when faced with other methods from the literature. This is the first use of symbolic regression in cosmology, representing a leap forward in astronomy-data-mining connection. Comment: 6 pages, 4 figures. Accepted for publication in MNRAS Letters
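
The two figures of merit quoted in this abstract, the mean and the scatter of the normalised residual $(z_{\rm phot} - z_{\rm spec})/(1+z_{\rm spec})$, can be sketched in a few lines of Python. This is only an illustration: the function name `photoz_metrics` is hypothetical, and the scatter is assumed here to be the population standard deviation of the residuals, which the abstract does not spell out.

```python
from statistics import mean, pstdev


def photoz_metrics(z_phot, z_spec):
    """Return (mean, scatter) of (z_phot - z_spec) / (1 + z_spec).

    Assumption: "scatter" is taken to be the population standard
    deviation of the normalised residuals.
    """
    if len(z_phot) != len(z_spec) or len(z_spec) == 0:
        raise ValueError("z_phot and z_spec must be non-empty and equally long")
    # Normalised residual for every galaxy in the sample.
    resid = [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]
    return mean(resid), pstdev(resid)


if __name__ == "__main__":
    # Made-up redshifts, purely to show the call.
    bias, scatter = photoz_metrics([0.11, 0.48, 0.93], [0.10, 0.50, 0.90])
    print(f"mean = {bias:.4f}, scatter = {scatter:.4f}")
```

Restricting the inputs to galaxies with $z_{\rm spec} \lesssim 1.0$ before the call would mirror the averaging range mentioned in the abstract.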

    Detecting stars, galaxies, and asteroids with Gaia

    (Abridged) Gaia aims to make a 3-dimensional map of 1,000 million stars in our Milky Way to unravel its kinematical, dynamical, and chemical structure and evolution. Gaia's on-board detection software discriminates stars from spurious objects like cosmic rays and Solar protons. For this, parametrised point-spread-function-shape criteria are used. This study aims to provide an optimum set of parameters for these filters. We developed an emulation of the on-board detection software, which has 20 free, so-called rejection parameters which govern the boundaries between stars on the one hand and sharp or extended events on the other hand. We evaluate the detection and rejection performance of the algorithm using catalogues of simulated single stars, double stars, cosmic rays, Solar protons, unresolved galaxies, and asteroids. We optimised the rejection parameters, improving - with respect to the functional baseline - the detection performance of single and double stars, while, at the same time, improving the rejection performance of cosmic rays and of Solar protons. We find that the minimum separation to resolve a close, equal-brightness double star is 0.23 arcsec in the along-scan and 0.70 arcsec in the across-scan direction, independent of the brightness of the primary. We find that, whereas the optimised rejection parameters have no significant impact on the detectability of de Vaucouleurs profiles, they do significantly improve the detection of exponential-disk profiles. We also find that the optimised rejection parameters provide detection gains for asteroids fainter than 20 mag and for fast-moving near-Earth objects fainter than 18 mag, albeit this gain comes at the expense of a modest detection-probability loss for bright, fast-moving near-Earth objects. The major side effect of the optimised parameters is that spurious ghosts in the wings of bright stars essentially pass unfiltered. Comment: Accepted for publication in A&A

    Using gamma regression for photometric redshifts of survey galaxies

    Machine learning techniques offer a plethora of opportunities in tackling big data within the astronomical community. We present the set of Generalized Linear Models, a set of tools not commonly applied within astronomy despite being widely used in other fields, as a fast alternative for determining photometric redshifts of galaxies. With this technique, we achieve catastrophic outlier rates of the order of ~1% in a matter of seconds on large datasets of size ~1,000,000. To make these techniques easily accessible to the astronomical community, we developed a set of libraries and tools that are publicly available. Comment: Refereed Proceeding of "The Universe of Digital Sky Surveys" conference held at the INAF - Observatory of Capodimonte, Naples, on 25th-28th November 2014, to be published in the Astrophysics and Space Science Proceedings, edited by Longo, Napolitano, Marconi, Paolillo, Iodice, 6 pages, and 1 figure
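
A catastrophic outlier rate of the kind quoted above can be computed as the fraction of galaxies whose normalised redshift error exceeds some cut. The sketch below is not taken from the paper: the function name is hypothetical, and the default threshold of 0.15 is an assumption (a commonly used convention), since the abstract does not state the cut it uses.

```python
def catastrophic_outlier_rate(z_phot, z_spec, threshold=0.15):
    """Fraction of objects with |z_phot - z_spec| / (1 + z_spec) > threshold.

    Assumption: threshold=0.15 is an illustrative convention, not a value
    stated in the abstract.
    """
    if len(z_phot) != len(z_spec) or len(z_spec) == 0:
        raise ValueError("z_phot and z_spec must be non-empty and equally long")
    # Count the objects whose normalised error exceeds the cut.
    n_outliers = sum(
        1
        for zp, zs in zip(z_phot, z_spec)
        if abs(zp - zs) / (1.0 + zs) > threshold
    )
    return n_outliers / len(z_spec)
```

A rate of ~1% would correspond to this function returning roughly 0.01 on the test sample.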

    The overlooked potential of Generalized Linear Models in astronomy-II: Gamma regression and photometric redshifts

    Machine learning techniques offer a precious tool box for use within astronomy to solve problems involving so-called big data. They provide a means to make accurate predictions about a particular system without prior knowledge of the underlying physical processes of the data. In this article, and the companion papers of this series, we present the set of Generalized Linear Models (GLMs) as a fast alternative method for tackling general astronomical problems, including the ones related to the machine learning paradigm. To demonstrate the applicability of GLMs to inherently positive and continuous physical observables, we explore their use in estimating the photometric redshifts of galaxies from their multi-wavelength photometry. Using the gamma family with a log link function we predict redshifts from the PHoto-z Accuracy Testing simulated catalogue and a subset of the Sloan Digital Sky Survey from Data Release 10. We obtain fits that result in catastrophic outlier rates as low as ~1% for simulated and ~2% for real data. Moreover, we can easily obtain such levels of precision within a matter of seconds on a normal desktop computer and with training sets that contain merely thousands of galaxies. Our software is made publicly available as a user-friendly package developed in Python, R and via an interactive web application. This software allows users to apply a set of GLMs to their own photometric catalogues and generates publication quality plots with minimum effort. By facilitating their ease of use to the astronomical community, this paper series aims to make GLMs widely known and to encourage their implementation in future large-scale projects, such as the Large Synoptic Survey Telescope.
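
To make "the gamma family with a log link function" concrete, here is a minimal sketch of such a fit by iteratively reweighted least squares (IRLS). It is not the authors' package: the function names are hypothetical, and it assumes a single predictor `x` (for example one colour) plus an intercept. For a gamma response with a log link, the variance function is $V(\mu)=\mu^2$ and $d\eta/d\mu = 1/\mu$, so the IRLS weights are all equal to 1 and each iteration reduces to an ordinary least-squares fit of the working response $\eta + (y-\mu)/\mu$.

```python
import math


def _ols(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def fit_gamma_log_link(x, y, n_iter=50, tol=1e-10):
    """Fit log(E[y]) = b0 + b1 * x for a gamma response by IRLS."""
    if any(yi <= 0 for yi in y):
        raise ValueError("gamma regression needs strictly positive y")
    # Starting values: least squares on log(y).
    b0, b1 = _ols(x, [math.log(yi) for yi in y])
    for _ in range(n_iter):
        eta = [b0 + b1 * xi for xi in x]        # linear predictor
        mu = [math.exp(e) for e in eta]         # inverse log link
        # Working response; the IRLS weights are 1 for gamma + log link.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        new_b0, new_b1 = _ols(x, z)
        step = abs(new_b0 - b0) + abs(new_b1 - b1)
        b0, b1 = new_b0, new_b1
        if step < tol:
            break
    return b0, b1


def predict(b0, b1, x):
    """Predicted mean response, always positive thanks to the log link."""
    return [math.exp(b0 + b1 * xi) for xi in x]
```

The log link keeps every predicted redshift positive, which is the property the abstract highlights for "inherently positive and continuous physical observables". A real photometric catalogue would use several magnitudes or colours as predictors, which turns the one-dimensional least-squares step into a matrix solve.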

    A probabilistic approach to emission-line galaxy classification

    We invoke a Gaussian mixture model (GMM) to jointly analyse two traditional emission-line classification schemes of galaxy ionization sources: the Baldwin-Phillips-Terlevich (BPT) and $\rm W_{H\alpha}$ vs. [NII]/H$\alpha$ (WHAN) diagrams, using spectroscopic data from the Sloan Digital Sky Survey Data Release 7 and SEAGal/STARLIGHT datasets. We apply a GMM to empirically define classes of galaxies in a three-dimensional space spanned by the optical parameters $\log$ [OIII]/H$\beta$, $\log$ [NII]/H$\alpha$, and $\log$ EW(H$\alpha$). The best-fit GMM based on several statistical criteria suggests a solution around four Gaussian components (GCs), which are capable of explaining up to 97 per cent of the data variance. Using elements of information theory, we compare each GC to its respective astronomical counterpart. GC1 and GC4 are associated with star-forming galaxies, suggesting the need to define a new starburst subgroup. GC2 is associated with BPT's Active Galactic Nuclei (AGN) class and WHAN's weak AGN class. GC3 is associated with BPT's composite class and WHAN's strong AGN class. Conversely, there is no statistical evidence -- based on four GCs -- for the existence of a Seyfert/LINER dichotomy in our sample. Notwithstanding, the inclusion of an additional GC5 unravels it. The GC5 appears associated with the LINER and Passive galaxies on the BPT and WHAN diagrams respectively. Subtleties aside, we demonstrate the potential of our methodology to recover/unravel different objects inside the wilderness of astronomical datasets, without lacking the ability to convey physically interpretable results. The probabilistic classifications from the GMM analysis are publicly available within the COINtoolbox (https://cointoolbox.github.io/GMM_Catalogue/). Comment: Accepted for publication in MNRAS

    Periodic Astrometric Signal Recovery through Convolutional Autoencoders

    Astrometric detection involves a precise measurement of stellar positions, and is widely regarded as the leading concept presently ready to find Earth-mass planets in temperate orbits around nearby Sun-like stars. The TOLIMAN space telescope[39] is a low-cost, agile mission concept dedicated to narrow-angle astrometric monitoring of bright binary stars. In particular, the mission will be optimised to search for habitable-zone planets around Alpha Centauri AB. If the separation between these two stars can be monitored with sufficient precision, tiny perturbations due to the gravitational tug from an unseen planet can be witnessed and, given the configuration of the optical system, the scale of the shifts in the image plane is about one millionth of a pixel. Image registration at this level of precision has never been demonstrated (to our knowledge) in any setting within science. In this paper we demonstrate that a Deep Convolutional Auto-Encoder is able to retrieve such a signal from simplified simulations of the TOLIMAN data and we present the full experimental pipeline to recreate our experiments from the simulations to the signal analysis. In future works, all the more realistic sources of noise and systematic effects present in the real-world system will be injected into the simulations. Comment: Preprint version of the manuscript to appear in the Volume "Intelligent Astrophysics" of the series "Emergence, Complexity and Computation", Book eds. I. Zelinka, D. Baron, M. Brescia, Springer Nature Switzerland, ISSN: 2194-728